Atty. Docket No. 22623-7060 
Genencor Ref. No. GC637-2 

one of skill will appreciate, more complicated randomization schemes can be designed which are 
more compatible with nucleotide-based mutagenesis. 

Codon mutagenesis can be done in equimolar ratios, e.g., for a given site all 
mutagenic oligomers are added in equimolar ratios, or in ratios that relate to the probability 
matrix and/or the constraint vector. For example, one can bias a library in favor of mutations 
which are more likely to result in a functional protein. If desired, wild type oligos can be added 
to adjust the overall frequency of mutagenesis for a position or a region of the target gene. 
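
The two pooling schemes described above can be sketched as follows. This is an illustrative sketch only: the function names, the dictionary representation of one row of the probability matrix, and the `wild_type_fraction` parameter are assumptions introduced for the example and are not part of the described method.

```python
def equimolar_fractions(residues):
    """Equimolar pooling: every mutagenic oligo for the site gets the same share."""
    residues = list(residues)
    return {aa: 1.0 / len(residues) for aa in residues}


def weighted_fractions(position_probs, wild_type, wild_type_fraction):
    """Pool mutant oligos in proportion to their (hypothetical) matrix probabilities.

    position_probs: dict mapping amino acid -> probability for one site.
    wild_type: the wild type residue at this site.
    wild_type_fraction: share of the pool given to the wild type oligo, which
        sets the overall mutation frequency at the site to 1 - wild_type_fraction.
    """
    if not 0.0 <= wild_type_fraction < 1.0:
        raise ValueError("wild_type_fraction must be in [0, 1)")
    mutants = {aa: p for aa, p in position_probs.items()
               if aa != wild_type and p > 0}
    total = sum(mutants.values())
    if total == 0:
        raise ValueError("no mutant residue has a positive probability")
    # Mutant oligos share the non-wild-type part of the pool proportionally.
    fractions = {aa: (1.0 - wild_type_fraction) * p / total
                 for aa, p in mutants.items()}
    if wild_type_fraction > 0:
        fractions[wild_type] = wild_type_fraction
    return fractions


if __name__ == "__main__":
    row = {"A": 0.5, "S": 0.3, "T": 0.2}  # made-up probabilities for one site
    print(equimolar_fractions(["S", "T"]))
    print(weighted_fractions(row, wild_type="A", wild_type_fraction=0.8))
```

Setting the wild type share per site is one simple way to tune the mutation frequency; a constraint vector could be applied beforehand by zeroing or rescaling entries of the row.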

In one embodiment, nucleotide-based randomization is used. This method has 
two advantages over synthesizing individual oligos for each substitution: it is less expensive as 
fewer oligos are needed; and the library will contain clones where neighboring (in linear 
sequence) positions have been simultaneously mutated. 

Nucleotide-based mutagenesis can be optimized to produce a desired set of amino 
acids (Goldman & Youvan, Bio/Technology 10:1557 (1992); Huang & Santi, Anal Biochem 
218:454 (1994); Jensen, et al., Nucleic Acids Res 26:697 (1998); and Tomandl, et al.,
J. Comp.-Aided Molec. Design 11:29 (1997)). These authors did not consider a probability
matrix; their focus was on inclusion of a desired set of amino acids. Nucleotide mixtures which
encode amino acid mixtures that optimally conform to the calculated probability matrix and
constraint vector can be calculated and synthesized.
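
As an illustration of how a nucleotide mixture maps to an amino acid mixture, the sketch below computes the amino acid distribution encoded by one degenerate codon and scores it against a target row of a probability matrix. The standard genetic code is used; the function names and the squared-error score are assumptions made for the example, and the search over candidate mixtures is not shown.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def amino_acid_distribution(mixtures):
    """Amino acid frequencies encoded by one degenerate codon.

    mixtures: three dicts (codon positions 1-3), each mapping base -> fraction.
    """
    dist = {}
    for picks in product(*(m.items() for m in mixtures)):
        codon = "".join(base for base, _ in picks)
        prob = 1.0
        for _, frac in picks:
            prob *= frac  # bases at each position are incorporated independently
        aa = CODON_TABLE[codon]
        dist[aa] = dist.get(aa, 0.0) + prob
    return dist


def squared_error(dist, target):
    """Hypothetical score: squared distance between encoded and target mixtures."""
    keys = set(dist) | set(target)
    return sum((dist.get(k, 0.0) - target.get(k, 0.0)) ** 2 for k in keys)


if __name__ == "__main__":
    n = {b: 0.25 for b in "ACGT"}   # N: equimolar A, C, G, T
    k = {"G": 0.5, "T": 0.5}        # K: equimolar G, T
    nnk = amino_acid_distribution([n, n, k])
    print(sorted(nnk.items()))
```

A search over the twelve base fractions (four per codon position) that minimizes such a score is one possible way to choose a mixture for a given site.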

Alternatively, portions of a coding region or an entire coding region can be 
chemically synthesized in a codon-by-codon technique using mixtures of activated trinucleotides 
at the positions to be substituted. In this way, only the desired codons are incorporated, 
dysfunctional mutations inevitably resulting from nucleotide-based randomization are avoided, 
and mixtures of adjacent changes can be readily provided. Additionally, controlling the degree 
of incorporation of a given mutation at a given position can be readily accomplished by varying 
the amount of the particular activated trinucleotides in the mixture for that position. 

Oligonucleotide-driven site-directed mutagenesis can also be used. Suitable site- 
directed techniques include those in which a template strand is used to prime the synthesis of a 
complementary strand lacking a modification in the parent strand, such as methylation or 
incorporation of uracil residues; introduction of the resulting hybrid molecules into a suitable 
host strain results in degradation of the template strand and replication of the desired mutated 
strand. See Kunkel, Proc Natl Acad Sci U S A 1985 Jan;82(2):488-92; QuikChange™ kits 
available from Stratagene, Inc., La Jolla, CA. Mixtures of individual primers for the 
substitutions to be introduced can be simultaneously employed in a single reaction to produce 
the desired combinations of mutations. Simultaneous mutation of adjacent residues can be 
accomplished by preparing a plurality of oligonucleotides representing the desired combinations. 
PCR methods for introducing site-directed changes can also be employed. 
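
For adjacent residues, the plurality of oligonucleotides representing the desired combinations can be enumerated as in the following sketch. The function name, the flanking sequences, and the codon choices are hypothetical and serve only to illustrate the bookkeeping.

```python
from itertools import product


def combination_oligos(flank5, site_codons, flank3):
    """List one oligo per combination of codons at consecutive sites.

    flank5, flank3: constant sequences flanking the mutated codons.
    site_codons: list of codon lists, one list per adjacent site, in order.
    """
    return [flank5 + "".join(combo) + flank3
            for combo in product(*site_codons)]


if __name__ == "__main__":
    # Two adjacent sites: 2 codons at the first, 3 at the second (made-up data).
    oligos = combination_oligos("ACGT", [["GCT", "TCT"],
                                         ["AAA", "CGT", "GAA"]], "TTGA")
    for seq in oligos:
        print(seq)
```

The number of oligos is the product of the number of codons allowed at each site, so this approach is practical mainly when few adjacent sites are varied together.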

Oligos synthesized from mixtures of nucleotides can be used. The synthesis of 
oligonucleotide libraries is well known in the art. In one alternative, degenerate oligos from 
trinucleotides can be used (Gaytan, et al., Chem Biol 5:519 (1998); Lyttle, et al., Biotechniques 
19:274 (1995); Virnekäs, et al., Nucl. Acids Res 22:5600 (1994); Sondek & Shortle, Proc. Nat'l 
Acad. Sci. USA 89:3581 (1992)). In another alternative, degenerate oligos can be synthesized by 
resin splitting (Lahr, et al., Proc. Nat'l Acad. Sci. USA 96:14860 (1999); Chatellier, et al., Anal. 
Biochem. 229:282 (1995); and Haaparanta & Huse, Mol Divers 1:39 (1995)). 

After the oligos which incorporate desired protein mutations are constructed, they 
can be assembled with the DNA that encodes the desired protein. Site-directed mutagenesis 
using a single stranded DNA template and mutagenic oligos is well known in the art (Ling & 
Robinson, Anal Biochem 254:157 (1997)). It has also been shown that several oligos can be 
incorporated at the same time using these methods (Zoller, Curr Opin Biotechnol 3:348 (1992)). 
Single stranded DNA templates are synthesized by degrading double stranded DNA 
(Strandase™ by Novagen). The resulting product after strand digestion can be heated and then 
directly used for sequencing. Alternatively, the template can be constructed as a phagemid or 
M13 vector. Other techniques of incorporating mutations into DNA are known and can be 
found in, e.g., Deng, et al., Anal Biochem 200:81 (1992). In an alternative embodiment, 
sequences are assembled by PCR fusion from synthetic oligos (Horton, et al., Gene 77:61 
(1989); Shi, et al., PCR Methods Appl. 3:46 (1993); and Cao, Technique 2:109 (1990)). PCR 
with a mixture of mutagenic oligos can be used to create the DNA sequences that reflect the 
diversity of the library. 




Cassette mutagenesis can also be used in site-directed random mutagenesis. 
Using this technique, a library can be generated by ligating fragments obtained by 
oligosynthesis, PCR or combinations thereof. Segments for ligation can, for example, be 
generated by PCR and subsequent digestion with type II restriction enzymes. This enables 
introduction of mutations via the PCR primers. Furthermore, type II restriction enzymes 
generate non-palindromic cohesive ends which significantly reduce the likelihood of ligating 
fragments in the wrong order. Techniques for ligating many fragments can be found in Berger, 
et al., Anal Biochem 214:571 (1993); and U.S. Pat. App. Ser. No. 09/566,645, filed May 8, 2000. 
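
The role of non-palindromic cohesive ends in ordered ligation can be illustrated with a simple design check. The sketch below assumes each junction is described by its overhang read 5' to 3' on one strand; the function names and the example overhangs are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def overhang_problems(overhangs):
    """Return index pairs of junction overhangs that could mis-ligate.

    (i, i) means overhang i is palindromic and can pair with itself.
    (j, i) means overhangs j and i are identical or reverse complements,
    so fragments could join at the wrong junction or in the wrong orientation.
    """
    problems = []
    for i, oh in enumerate(overhangs):
        if oh == reverse_complement(oh):
            problems.append((i, i))
        for j in range(i):
            other = overhangs[j]
            if oh == other or oh == reverse_complement(other):
                problems.append((j, i))
    return problems


if __name__ == "__main__":
    print(overhang_problems(["AATG", "GCTT", "TACC"]))  # distinct, non-palindromic
    print(overhang_problems(["GATC", "AATG"]))          # GATC is palindromic
```

An empty result indicates that, under this simplified model, each junction can pair only with its intended partner.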

A problem encountered in random mutagenesis is the generation of stop codons 
at the site of diversity. In vitro translation can be used to obtain libraries that are free of stop 
codons or other artifacts (Cho, et al., J Mol Biol 297:309 (2000)). 
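
The size of the stop codon problem can be estimated with a short calculation. The sketch below assumes equimolar bases at each degenerate position and independent sites; the IUPAC codes shown and the function names are choices made for the example.

```python
from itertools import product

# Subset of IUPAC nucleotide ambiguity codes used in common randomization schemes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "S": "CG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def stop_fraction(scheme):
    """Fraction of codons in a degenerate scheme (e.g. 'NNK') that are stops."""
    codons = ["".join(c) for c in product(*(IUPAC[s] for s in scheme))]
    return sum(c in STOP_CODONS for c in codons) / len(codons)


def stop_free_fraction(scheme, n_sites):
    """Fraction of clones with no stop codon at any of n_sites randomized codons."""
    return (1.0 - stop_fraction(scheme)) ** n_sites


if __name__ == "__main__":
    for scheme in ("NNN", "NNK", "NNS"):
        print(scheme, stop_fraction(scheme), stop_free_fraction(scheme, 6))
```

Under these assumptions NNN yields 3 stop codons out of 64 and NNK or NNS yields 1 out of 32, so the stop-free fraction of a library falls geometrically with the number of randomized sites.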

The particular chemical and/or molecular biological methods used to construct 
the library are not critical; any method(s) which provide the desired library can be used. For 
example, oligonucleotides can be inserted into a phage vector so that the phage particle 
expresses the encoded protein on its surface. Alternatively, one can manufacture a protein array 
wherein the encoded proteins are immobilized on a suitable surface and functional activity is 
assessed and the corresponding protein identified. In yet another embodiment, if the ability of a 
protein to bind to a target is the desired function, a mixture of proteins encoded by the library 
can be contacted with the desired target and the proteins bound identified and sequenced. For 
construction of libraries, see US Patent Nos. 6,114,149; 6,107,059; 5,922,545; 5,830,721; 
5,723,323; 5,698,426; 5,571,698; 5,565,332; and PCT Patent Application WO 0046344. 

VI. CHARACTERIZING THE LIBRARY MEMBERS 

After a library is generated, the members can be characterized and the library 
screened for members that exhibit the desired activity. In addition to finding the desired 
functional protein, the information from the screen can be used to design an improved probability 
matrix and constraint vector for a next iteration of mutagenesis and library construction. For 
example, the probability matrix can be improved by determining the mutations in the gene that 
are compatible with expression, folding, and/or stability. Identifying stabilizing mutations or 
combinations of mutations can be of particular importance if library size is very limited by 






